

Section: New Results

Communities and Social Interactions Analysis

Fake News Detection

Participants : Elena Cabrio, Serena Villata, Jérôme Delobelle.

This work is part of the DGA project RAPID CONFIRMA (COntre argumentation contre les Fausses InfoRMAtion), which aims to automatically detect fake news and limit their diffusion. To this end, we are developing a framework to detect fake news, reduce their propagation, and propose the best response strategies. Beyond identifying the communities propagating fake news, our goal is to propose a method for convincing a person that a piece of information is actually false, a key element in fighting the spread of this kind of dangerous information. To achieve this goal, we orient our research towards the generation of counter-argumentation. Counter-argumentation is the process of putting forward counter-arguments that provide evidence against a certain argument previously proposed. In the case of fake news, in order to convince the reader that the (fake) information is true, the author will use different methods of persuasion via arguments. Identifying these arguments and attacking them with carefully constructed arguments from safe sources is therefore a way to fight this phenomenon and its spread along the social network. More precisely, we have identified four steps in the counter-argumentation process: (1) identifying the arguments used in the fake news (argument mining); (2) determining, for each argument, whether it is for or against the topic of the fake news (stance detection); (3) identifying the key arguments that our system must attack (classification); and (4) providing a set of arguments from safe sources to attack the targeted fake arguments (counter-argumentation).
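The four steps above can be sketched as a simple pipeline; all function names and heuristics below are hypothetical stand-ins for trained components, not the project's actual system:

```python
# Hypothetical skeleton of the four-step counter-argumentation pipeline.
# Each stage is a stub standing in for a trained model.

def mine_arguments(text):
    """Step 1 (argument mining): extract candidate argument spans.
    Stub: treat each sentence as one argument."""
    return [s.strip() for s in text.split(".") if s.strip()]

def detect_stance(argument, topic):
    """Step 2 (stance detection): label each argument PRO or CON
    with respect to the fake news topic. Stub: keyword heuristic."""
    return "PRO" if topic.lower() in argument.lower() else "CON"

def select_targets(stanced):
    """Step 3 (classification): pick the key arguments to attack,
    here simply all PRO arguments."""
    return [arg for arg, stance in stanced if stance == "PRO"]

def counter_argue(target, trusted_sources):
    """Step 4 (counter-argumentation): retrieve an attacking argument
    from a safe source. Stub: first source sharing a word with the target."""
    words = set(target.lower().split())
    for src in trusted_sources:
        if words & set(src.lower().split()):
            return src
    return None

def pipeline(text, topic, trusted_sources):
    stanced = [(a, detect_stance(a, topic)) for a in mine_arguments(text)]
    return {t: counter_argue(t, trusted_sources) for t in select_targets(stanced)}
```

In a real system each stub would be replaced by a learned model, but the data flow between the four stages stays the same.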

We are also interested in studying, from a formal point of view, how to cast the notion of interpretability (i.e. the degree to which an observer can understand the cause(s) of a result) in abstract argumentation so that the reasons leading to the acceptability of one or a set of arguments in a framework (returned by a particular semantics) may be explicitly assessed [13]. More precisely, this research question breaks down into the following sub-questions: (i) how to formally define and characterise the notion of impact of an argument with respect to the acceptability of the other arguments in the framework? and (ii) how does this impact play a role in the interpretation process of the acceptability of arguments in the framework?
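As an illustration of sub-question (i), one simple way to quantify an argument's impact is to compare the accepted arguments of a Dung-style abstract framework with and without that argument; the measure below is an assumption for illustration, not the notion studied in [13], though the grounded semantics it relies on is standard:

```python
# Abstract argumentation framework as a pair (arguments, attacks).
# Grounded extension computed as the least fixed point of the
# characteristic function (standard Dung semantics).

def grounded(args, attacks):
    attackers = {a: {x for x, y in attacks if y == a} for a in args}
    ext = set()
    while True:
        # an argument is defended if every one of its attackers
        # is attacked by some member of the current extension
        defended = {a for a in args
                    if all(any((d, b) in attacks for d in ext)
                           for b in attackers[a])}
        if defended == ext:
            return ext
        ext = defended

def impact(arg, args, attacks):
    """Hypothetical impact measure: the arguments whose grounded
    acceptability changes when `arg` is removed from the framework."""
    base = grounded(args, attacks)
    rest = args - {arg}
    reduced = grounded(rest, {(x, y) for x, y in attacks if arg not in (x, y)})
    return (base ^ reduced) & rest
```

For example, in the chain a attacks b, b attacks c, removing a flips the status of both b and c, so a has maximal impact under this toy measure.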

Hate Speech Detection

Participants : Elena Cabrio, Alain Giboin, Sara Tonelli, Michele Corazza, Pinar Arslan, Stefano Menini.

On the topic of cyberbullying and hate speech detection, we proposed a message-level cyberbullying annotation of an Instagram dataset and studied the correlations between the emotion, sentiment and bullying labels annotated on it. We then built a message-level emotion classifier that automatically predicts emotion labels for each comment in the Vine bullying dataset, and a session-based bullying classifier using n-gram, emotion, sentiment and concept-level features. Both the emotion and the bullying classifiers use Linear Support Vector Classification. Our results showed that the “anger” and “negative” labels correlate positively with the presence of bullying. Concept-level, emotion and sentiment features at different levels all contribute to the bullying classifier, especially to the bullying class. Our best performing bullying classifier, combining n-grams with concept-level features (e.g., polarity, averaged polarity intensity, moodtags and semantics features), reached an F1-score of 0.65 for the bullying class and a macro average F1-score of 0.7520. The results of this research have been published at SAC 2019 [7].
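The classifier family described above can be sketched with scikit-learn as an n-gram pipeline over a linear SVM; the toy comments, labels and feature choices below are illustrative placeholders, not the annotated Instagram/Vine data or the full feature set:

```python
# Sketch: n-gram features + Linear Support Vector Classification,
# the setup used for the bullying classifier (toy data only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

comments = ["you are so stupid", "nobody likes you", "great photo!",
            "have a nice day", "you are worthless", "love this"]
labels = [1, 1, 0, 0, 1, 0]  # 1 = bullying, 0 = not (toy annotation)

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # word uni- and bigrams
    LinearSVC())
clf.fit(comments, labels)

preds = clf.predict(comments)
macro_f1 = f1_score(labels, preds, average="macro")
```

The reported system additionally concatenated emotion, sentiment and concept-level feature vectors to the n-gram representation before training the SVM.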

Together with colleagues at FBK Trento, we performed a comparative evaluation of datasets for hate speech detection in Italian, extracted from four different social media platforms: Facebook, Twitter, Instagram and WhatsApp. We showed that combining such platform-dependent datasets, so as to take advantage of training data developed for other platforms, is beneficial, although the impact varies depending on the social network under consideration. The results of this research have been published at SAC 2019 [11].
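The cross-platform setup can be summarised as follows: for each target platform, the training set is its own data augmented with the other platforms' data. The platform names come from the study; the contents below are toy placeholders:

```python
# Sketch of the cross-platform training setup from the comparative
# evaluation: each target platform trains on its own data plus the
# data collected for the other three platforms (toy examples only).

datasets = {
    "facebook":  [("commento fb 1", 1), ("commento fb 2", 0)],
    "twitter":   [("tweet 1", 0)],
    "instagram": [("caption 1", 1)],
    "whatsapp":  [("messaggio 1", 0)],
}

def combined_training(target, datasets):
    """Training set for `target`: its own examples first, then the
    examples from every other platform."""
    train = list(datasets[target])
    for name, data in datasets.items():
        if name != target:
            train.extend(data)
    return train
```

A classifier trained on `combined_training(p, datasets)` is then evaluated on held-out data from platform `p` alone, so the benefit of the extra out-of-platform data can be measured per platform.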